This is a survey of the basic properties of strong mixing conditions
for sequences of random variables. The focus will be on the "structural"
properties of these conditions, and not at all on limit theory. For a
discussion of central limit theorems and related results under these
conditions, the reader is referred to Peligrad [60] or Iosifescu [50]. This
survey is divided into eight sections, as follows:

1. Measures of dependence
2. Five strong mixing conditions
3. Mixing conditions for two or more sequences
4. Mixing conditions for Markov chains
5. Mixing conditions for Gaussian sequences
6. Some other special examples
7. The behavior of the dependence coefficients
8. Approximation of mixing sequences by other random sequences


1.  MEASURES  OF  DEPENDENCE 


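Suppose $(\Omega,\mathcal{F},P)$ is a probability space. For any two
$\sigma$-fields $\mathcal{A}$ and $\mathcal{B} \subset \mathcal{F}$, define
the following measures of dependence:

$$\alpha(\mathcal{A},\mathcal{B}) := \sup |P(A \cap B) - P(A)P(B)|, \quad A \in \mathcal{A},\ B \in \mathcal{B}. \tag{1.1}$$

$$\phi(\mathcal{A},\mathcal{B}) := \sup |P(B|A) - P(B)|, \quad A \in \mathcal{A},\ B \in \mathcal{B},\ P(A) > 0. \tag{1.2}$$

$$\phi_{\mathrm{rev}}(\mathcal{A},\mathcal{B}) := \phi(\mathcal{B},\mathcal{A}) \quad \text{("rev" stands for "reversed")}. \tag{1.3}$$

$$\psi(\mathcal{A},\mathcal{B}) := \sup \frac{|P(A \cap B) - P(A)P(B)|}{P(A)P(B)}, \quad A \in \mathcal{A},\ B \in \mathcal{B}. \tag{1.4}$$

$$\rho(\mathcal{A},\mathcal{B}) := \sup |\mathrm{Corr}(X,Y)|, \quad X \in L_2(\mathcal{A}),\ Y \in L_2(\mathcal{B});\ X, Y \text{ real}. \tag{1.5}$$

$$\beta(\mathcal{A},\mathcal{B}) := \sup \frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{J} |P(A_i \cap B_j) - P(A_i)P(B_j)|, \tag{1.6}$$

where this latter sup is taken over all pairs of partitions
$\{A_1,\ldots,A_I\}$ and $\{B_1,\ldots,B_J\}$ of $\Omega$ such that
$A_i \in \mathcal{A}$ for all $i$ and $B_j \in \mathcal{B}$ for all $j$.
In (1.4) and in the sequel, 0/0 is interpreted to be 0. These measures of
dependence will be the basis for the mixing conditions that we shall study,
starting with Section 2. Here in Section 1 we shall just study these
measures of dependence.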

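The following inequalities hold:

$$2\alpha(\mathcal{A},\mathcal{B}) \le \beta(\mathcal{A},\mathcal{B}) \le \phi(\mathcal{A},\mathcal{B}) \le \tfrac{1}{2}\psi(\mathcal{A},\mathcal{B}). \tag{1.7}$$

$$4\alpha(\mathcal{A},\mathcal{B}) \le \rho(\mathcal{A},\mathcal{B}) \le \psi(\mathcal{A},\mathcal{B}). \tag{1.8}$$

$$\rho(\mathcal{A},\mathcal{B}) \le 2\,\phi^{1/2}(\mathcal{A},\mathcal{B}) \cdot \phi_{\mathrm{rev}}^{1/2}(\mathcal{A},\mathcal{B}). \tag{1.9}$$

$$\rho(\mathcal{A},\mathcal{B}) = \sup \|E(f|\mathcal{B}) - Ef\|_2 / \|f\|_2, \quad f \in L_2(\mathcal{A}),\ f \text{ real}. \tag{1.10}$$

$$\alpha(\mathcal{A},\mathcal{B}) \le \tfrac{1}{4}, \quad \beta(\mathcal{A},\mathcal{B}) \le 1, \quad \phi(\mathcal{A},\mathcal{B}) \le 1, \quad \rho(\mathcal{A},\mathcal{B}) \le 1. \tag{1.11}$$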


Eqn. (1.9), an improvement of the earlier well known inequality
$\rho(\mathcal{A},\mathcal{B}) \le 2\phi^{1/2}(\mathcal{A},\mathcal{B})$,
comes from Peligrad [59, p. 462, eqn. (4)]; independently the kindred
inequality $\rho(\mathcal{A},\mathcal{B}) \le 2 \cdot \max\{\phi(\mathcal{A},\mathcal{B}),
\phi_{\mathrm{rev}}(\mathcal{A},\mathcal{B})\}$ was given by Denker and
Keller [34, p. 516, line -8]. In this last inequality as well as in
(1.7), (1.8), and (1.9), equality is achieved in some simple cases such as
when $\mathcal{A} = \mathcal{B} = \{\Omega, A, A^c, \emptyset\}$ where
$P(A) = 1/2$. Eqns. (1.7), (1.8), (1.10), and (1.11) are all either trivial
or at least fairly easy to prove. Referring to eqn. (1.11),
$\psi(\mathcal{A},\mathcal{B})$ can of course take on the value $+\infty$.
Each of the measures of dependence in eqns. (1.1)-(1.6) takes the value 0
precisely when $\mathcal{A}$ and $\mathcal{B}$ are independent
$\sigma$-fields.
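
The equality case mentioned above can be checked numerically. The sketch
below is an illustration only: it handles just the case where each
$\sigma$-field is generated by a single event, and the function name
`dependence_measures` and its parameters are hypothetical (they do not come
from the original text). It enumerates all events for (1.1)-(1.4), uses the
finest partitions for (1.6), and uses the correlation of the two indicator
functions for (1.5), which suffices here because every function of a
two-valued random variable is affine.

```python
from itertools import product
from math import sqrt


def dependence_measures(pA, pB, pAB):
    """Measures (1.1)-(1.6) for the sigma-fields generated by single events A and B.

    Assumes 0 < pA < 1, 0 < pB < 1 and that pAB = P(A and B) is compatible with them.
    """
    # Marginals and joint probabilities of the atoms (index 1 = the event, 0 = its complement).
    margA = {1: pA, 0: 1 - pA}
    margB = {1: pB, 0: 1 - pB}
    joint = {(1, 1): pAB, (1, 0): pA - pAB,
             (0, 1): pB - pAB, (0, 0): 1 - pA - pB + pAB}
    # Every event of a four-element sigma-field is a union of atoms.
    events = [(), (0,), (1,), (0, 1)]

    alpha = phi = phi_rev = psi = 0.0
    for E, G in product(events, events):
        pE = sum(margA[i] for i in E)
        pG = sum(margB[j] for j in G)
        pEG = sum(joint[i, j] for i in E for j in G)
        diff = abs(pEG - pE * pG)
        alpha = max(alpha, diff)
        if pE > 0:
            phi = max(phi, diff / pE)            # |P(G|E) - P(G)|
        if pG > 0:
            phi_rev = max(phi_rev, diff / pG)    # roles of the two sigma-fields swapped
        if pE > 0 and pG > 0:
            psi = max(psi, diff / (pE * pG))

    # The finest partitions {A, A^c} and {B, B^c} attain the sup in (1.6).
    beta = 0.5 * sum(abs(joint[i, j] - margA[i] * margB[j])
                     for i in (0, 1) for j in (0, 1))
    # Maximal correlation reduces to the correlation of the indicators of A and B.
    rho = abs(pAB - pA * pB) / sqrt(pA * (1 - pA) * pB * (1 - pB))
    return {"alpha": alpha, "phi": phi, "phi_rev": phi_rev,
            "psi": psi, "beta": beta, "rho": rho}


if __name__ == "__main__":
    # A = B with P(A) = 1/2: the equality case for (1.7), (1.8) and (1.9).
    print(dependence_measures(0.5, 0.5, 0.5))
```

For $A = B$ with $P(A) = 1/2$ this gives $\alpha = 1/4$, $\beta = \phi = 1/2$
and $\rho = \psi = 1$, so each of (1.7), (1.8) and (1.9) holds with equality.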

The measures of dependence in eqns. (1.1)-(1.6) fit nicely into a more
general framework using "norms" of the bilinear form "covariance". For
any $\sigma$-field $\mathcal{A}$, let $S(\mathcal{A})$ denote the set of all
complex-valued simple $\mathcal{A}$-measurable random variables. (The
particularly nice form of eqn. (1.13) below depends on the use of
complex-valued rather than just real-valued random variables; however, this
is not of any special importance.) Define the following families of measures
of dependence between pairs of $\sigma$-fields $\mathcal{A}$ and
$\mathcal{B}$:

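For $0 \le r,s \le 1$,

$$\alpha_{r,s}(\mathcal{A},\mathcal{B}) := \sup \frac{|P(A \cap B) - P(A)P(B)|}{[P(A)]^r [P(B)]^s}, \quad A \in \mathcal{A},\ B \in \mathcal{B}.$$

For $1 \le p,q \le \infty$,

$$R_{p,q}(\mathcal{A},\mathcal{B}) := \sup \frac{|EXY - EX \cdot EY|}{\|X\|_p \|Y\|_q}, \quad X \in S(\mathcal{A}),\ Y \in S(\mathcal{B}).$$

(Note that $\alpha_{r,s}(\cdot,\cdot)$ is a variant of $R_{1/r,1/s}$ using
only indicator functions.) Obviously the measures of dependence in eqns.
(1.1)-(1.5) are respectively $\alpha_{0,0}(\mathcal{A},\mathcal{B})$,
$\alpha_{1,0}(\mathcal{A},\mathcal{B})$, $\alpha_{0,1}(\mathcal{A},\mathcal{B})$,
$\alpha_{1,1}(\mathcal{A},\mathcal{B})$, and $R_{2,2}(\mathcal{A},\mathcal{B})$.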

(The equation $\rho(\mathcal{A},\mathcal{B}) = R_{2,2}(\mathcal{A},\mathcal{B})$
holds by [75, Theorem 1.1] and a simple calculation.) Also it is easy to
show that if one modifies the definition of
$R_{\infty,\infty}(\mathcal{A},\mathcal{B})$ in an appropriate way so as to
allow the r.v.'s $X$ and $Y$ to take their values in the Banach spaces
$\ell^1$ and $\ell^\infty$ respectively, then one obtains a measure of
dependence that is within a positive constant factor of
$\beta(\mathcal{A},\mathcal{B})$ in eqn. (1.6) (see [22, Section 2.2]).

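If $0 \le r_0, r_1, s_0, s_1 \le 1$, $0 < \theta < 1$,
$r := (1-\theta)r_0 + \theta r_1$, and $s := (1-\theta)s_0 + \theta s_1$,
then for any two $\sigma$-fields $\mathcal{A}$ and $\mathcal{B}$,

$$\alpha_{r,s}(\mathcal{A},\mathcal{B}) \le [\alpha_{r_0,s_0}(\mathcal{A},\mathcal{B})]^{1-\theta} \cdot [\alpha_{r_1,s_1}(\mathcal{A},\mathcal{B})]^{\theta}, \tag{1.12}$$

$$R_{1/r,1/s}(\mathcal{A},\mathcal{B}) \le [R_{1/r_0,1/s_0}(\mathcal{A},\mathcal{B})]^{1-\theta} \cdot [R_{1/r_1,1/s_1}(\mathcal{A},\mathcal{B})]^{\theta}. \tag{1.13}$$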
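
For (1.12), the argument amounts to factoring the ratio that defines
$\alpha_{r,s}$. A sketch follows, using the convention 0/0 = 0 from (1.4);
the shorthand $D(A,B)$ is introduced here for illustration only and is not
notation from the original.

```latex
% Shorthand (not in the original): D(A,B) := |P(A \cap B) - P(A)P(B)|.
% Since r = (1-\theta) r_0 + \theta r_1 and s = (1-\theta) s_0 + \theta s_1,
\[
\frac{D(A,B)}{[P(A)]^{r}[P(B)]^{s}}
= \left(\frac{D(A,B)}{[P(A)]^{r_0}[P(B)]^{s_0}}\right)^{1-\theta}
  \left(\frac{D(A,B)}{[P(A)]^{r_1}[P(B)]^{s_1}}\right)^{\theta}
\le [\alpha_{r_0,s_0}(\mathcal{A},\mathcal{B})]^{1-\theta}
    [\alpha_{r_1,s_1}(\mathcal{A},\mathcal{B})]^{\theta}.
\]
% Taking the supremum over A \in \mathcal{A}, B \in \mathcal{B} on the left gives (1.12).
```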


Eqn. (1.12) has a trivial one-line proof and is useful for comparing the
various measures of dependence $\alpha_{r,s}$ (see [21, Theorems 3.1, 3.2,
and 4.1(i)(ii)]). Eqn. (1.13) is an application of Thorin's multilinear
version of the Riesz-Thorin interpolation theorem (see e.g. [5, p. 18,
Exercise 13]); eqn. (1.13) and variants of it are useful for comparing
various measures of dependence and for studying the relations between them
(see [67, Chapter 7] [56, Lemma 1] [21] [22]). For example, as a
consequence of (1.13) one can show that if $1 \le p,q,t \le \infty$ and
$1/p + 1/q + 1/t = 1$, then for any two $\sigma$-fields $\mathcal{A}$ and
$\mathcal{B}$,

$$R_{p,q}(\mathcal{A},\mathcal{B}) \le (2\pi) \cdot [\alpha(\mathcal{A},\mathcal{B})]^{1/t} \cdot [\phi(\mathcal{A},\mathcal{B})]^{1/p} \cdot [\phi_{\mathrm{rev}}(\mathcal{A},\mathcal{B})]^{1/q}$$

(see [21, Theorem 1.1(i)]). Except for a constant factor, this inequality
covers some other previously known ones as special cases, including eqn.
(1.9). In an obvious way, a "small" upper bound on
$R_{p,q}(\mathcal{A},\mathcal{B})$ might lead to a "small" value of
Cov(X,Y) if, say, $X$ and $Y$ are r.v.'s which are $\mathcal{A}$-measurable
and $\mathcal{B}$-measurable respectively. Such bounds are often useful in
the proofs of limit theorems for dependent random variables.
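
The step from a bound on $R_{p,q}$ to a bound on a covariance is immediate
from the definition. A minimal worked version, assuming that $X$ and $Y$ are
simple random variables with nonzero norms (so that the definition of
$R_{p,q}$ applies to them directly):

```latex
\[
|\mathrm{Cov}(X,Y)| = |EXY - EX \cdot EY|
\le R_{p,q}(\mathcal{A},\mathcal{B})\,\|X\|_p\,\|Y\|_q ,
\qquad X \in S(\mathcal{A}),\ Y \in S(\mathcal{B}).
\]
% For p = q = 2, using \rho = R_{2,2}, this reads
\[
|\mathrm{Cov}(X,Y)| \le \rho(\mathcal{A},\mathcal{B})\,\|X\|_2\,\|Y\|_2 .
\]
```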




Further information about measures of dependence can be gained from
the use of other methods and results in interpolation theory, such as the
techniques in the Marcinkiewicz interpolation theorem and the Stein-Weiss
[72] methods for handling indicator functions. That observation is due to
W. Bryc. For example, using such techniques Bryc proved that if
$1 < p,q < \infty$ and $1/p + 1/q = 1$, then for any two $\sigma$-fields
$\mathcal{A}$ and $\mathcal{B}$,

$$R_{p,q}(\mathcal{A},\mathcal{B}) \le C \cdot \alpha_{1/p,1/q}(\mathcal{A},\mathcal{B}) \cdot [1 - \log \alpha_{1/p,1/q}(\mathcal{A},\mathcal{B})] \tag{1.14}$$

where $C$ is a positive constant that depends only on $p$ and $q$ (see [21,
Theorem 4.1(vi)]). (For the case $p = q = 2$, (1.14) improved a similar but
much weaker inequality established in [15] by different methods; see also
[24].) For any choice of $p$ and $q$ meeting the given specifications, (1.14)
is within a constant factor of being sharp (see [22, Section 1.1]).

Let us say that two measures of dependence are "equivalent" if each
one becomes arbitrarily small as the other one becomes sufficiently small.
Among the measures of dependence $\alpha_{r,s}$, $0 \le r,s \le 1$, and
$R_{p,q}$, $1 \le p,q \le \infty$, there are only five equivalence classes
which consist of more than one member:

(i) $\alpha_{r,s}$, $r + s < 1$; $R_{p,q}$, $1/p + 1/q < 1$;

(ii) $\alpha_{r,1-r}$, $0 < r < 1$; $R_{p,p'}$, $1 < p < \infty$ (where $p'$
is defined by $1/p + 1/p' = 1$);

(iii) $\alpha_{1,0}$, $R_{1,\infty}$;

(iv) $\alpha_{0,1}$, $R_{\infty,1}$;

(v) $\alpha_{1,1}$, $R_{1,1}$ (which are equal by a simple argument).

These five classes each contain one of the measures of dependence in eqns.
(1.1)-(1.5). The measure $\beta$ in (1.6) is not equivalent to any of the
measures $\alpha_{r,s}$ or $R_{p,q}$. All of this is explained in [21,
Remark 4.1] and [22, Sections 1.2 and 2.2].
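

2.  FIVE  STRONG  MIXING  CONDITIONS


Henceforth all random variables are real-valued. For any family
$(Y_s, s \in S)$ of random variables (where $S$ is an index set), the
notation $\sigma(Y_s, s \in S)$ will mean the $\sigma$-field generated by
this family, i.e. the smallest $\sigma$-field containing the events
$\{Y_s < r\}$, $s \in S$, $r \in \mathbb{R}$.

Suppose $(X_k, k \in \mathbb{Z})$ is a sequence of random variables. (No
assumption of stationarity is made yet.) For $-\infty \le J \le L \le \infty$
define $\mathcal{F}_J^L := \sigma(X_k, J \le k \le L)$. Referring to eqns.
(1.1)-(1.6), for each $n = 1,2,3,\ldots$ define

$$\alpha(n) := \sup_{J \in \mathbb{Z}} \alpha(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty}),$$

$$\phi(n) := \sup_{J \in \mathbb{Z}} \phi(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty}),$$

$$\psi(n) := \sup_{J \in \mathbb{Z}} \psi(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty}),$$

$$\rho(n) := \sup_{J \in \mathbb{Z}} \rho(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty}),$$

$$\beta(n) := \sup_{J \in \mathbb{Z}} \beta(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty}).$$
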

The sequence $(X_k)$ is said to be

strongly mixing [66] if $\lim_{n\to\infty} \alpha(n) = 0$,

$\phi$-mixing [46] if $\lim_{n\to\infty} \phi(n) = 0$,

$\psi$-mixing ([8] essentially) if $\lim_{n\to\infty} \psi(n) = 0$,

$\rho$-mixing [52] if $\lim_{n\to\infty} \rho(n) = 0$,

absolutely regular [73] if $\lim_{n\to\infty} \beta(n) = 0$.

These are the five conditions on which we shall focus. The $\psi$-mixing
condition actually evolved from the "*-mixing" condition, which was the
condition studied in [8]:
$\lim_{n\to\infty} \sup_{J \ge 1} \psi(\mathcal{F}_1^J, \mathcal{F}_{J+n}^{\infty}) = 0$.
The maximal correlation coefficient $\rho(\mathcal{A},\mathcal{B})$ was
studied in [44] [39], much earlier than the $\rho$-mixing condition.
The absolute regularity condition was attributed in [73] to Kolmogorov.


Several minor comments are in order:

(i) In defining the strong mixing condition ($\alpha(n) \to 0$) for a
"singly-infinite" sequence $(X_k, k = 1,2,3,\ldots)$ one modifies the
definition of $\alpha(n)$ as follows:
$\alpha(n) := \sup_{J \ge 1} \alpha(\mathcal{F}_1^J, \mathcal{F}_{J+n}^{\infty})$.

(ii) In defining the strong mixing condition for a strictly stationary
doubly-infinite sequence $(X_k, k \in \mathbb{Z})$ one can simply define
$\alpha(n)$ by
$\alpha(n) := \alpha(\mathcal{F}_{-\infty}^{0}, \mathcal{F}_{n}^{\infty})$.

(iii) In whatever context one is dealing with, the sequence of numbers
$\alpha(1), \alpha(2), \alpha(3), \ldots$ is obviously automatically
non-increasing.

(iv) If a given random sequence $(X_k, k \in \mathbb{Z})$ is strongly
mixing, and for each $k \in \mathbb{Z}$, $f_k: \mathbb{R} \to \mathbb{R}$ is
a Borel-measurable function, then the random sequence
$(f_k(X_k), k \in \mathbb{Z})$ is also obviously strongly mixing, with the
dependence coefficients $\alpha(n)$, $n = 1,2,\ldots$ for the sequence
$(f_k(X_k))$ being no greater than the corresponding ones for $(X_k)$.

(v) If a strictly stationary strongly mixing singly-infinite sequence
$(X_1, X_2, X_3, \ldots)$ is extended to a strictly stationary
doubly-infinite sequence $(X_k, k \in \mathbb{Z})$, then this new
doubly-infinite sequence is also strongly mixing, with precisely the same
dependence coefficients $\alpha(n)$, $n = 1,2,\ldots$.

(vi) Comments (i)-(v) carry over verbatim to the other mixing conditions
defined above ($\phi$-mixing, $\psi$-mixing, $\rho$-mixing, and absolute
regularity) and their dependence coefficients.

By eqns. (1.7), (1.8), (1.9), and (1.11) the following implications hold
for a given random sequence:

(i) $\rho$-mixing $\Rightarrow$ strong mixing.

(ii) absolute regularity $\Rightarrow$ strong mixing.

(iii) $\phi$-mixing $\Rightarrow$ $\rho$-mixing and absolute regularity.

(iv) $\psi$-mixing $\Rightarrow$ $\phi$-mixing.
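
The full set of implications that follow from (i)-(iv) by chaining can be
listed mechanically. The short sketch below is illustrative only; the
dictionary `DIRECT` and the function `implied_by` are hypothetical names
introduced here, and the code encodes nothing beyond the four implications
just stated.

```python
# Direct implications (i)-(iv): each condition maps to the conditions it implies directly.
DIRECT = {
    "rho-mixing": {"strong mixing"},
    "absolute regularity": {"strong mixing"},
    "phi-mixing": {"rho-mixing", "absolute regularity"},
    "psi-mixing": {"phi-mixing"},
    "strong mixing": set(),
}


def implied_by(condition, direct=DIRECT):
    """Return every condition reachable from `condition` by chaining direct implications."""
    seen = set()
    stack = [condition]
    while stack:
        current = stack.pop()
        for weaker in direct[current]:
            if weaker not in seen:
                seen.add(weaker)
                stack.append(weaker)
    return seen


if __name__ == "__main__":
    for name in DIRECT:
        print(name, "=>", sorted(implied_by(name)))
```

In particular, $\psi$-mixing implies all four of the other conditions, while
$\rho$-mixing and absolute regularity each yield only strong mixing.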


Among these five mixing conditions there are (aside from transitivity) no
other general implications. (For special families of random sequences,
however, e.g. Gaussian sequences, discrete Markov chains, etc., there are
other implications; this will be seen in more detail in Sections 4 and 5
later on.) Since "strong mixing" is the weakest of these five conditions,
these conditions (and others that imply strong mixing) are sometimes
referred to collectively as "strong mixing conditions" (plural). The term
"strong mixing condition" (singular) will refer to the condition
$\alpha(n) \to 0$ as above. Of course all of these mixing conditions are
satisfied by sequences of independent r.v.'s and also by m-dependent
sequences. Other examples will be encountered in Sections 4, 5, 6, and 7
later on. Later in Section 2 here the strong mixing condition will be
compared to standard conditions in ergodic theory.

For a given sequence $(X_k, k \in \mathbb{Z})$ the $\phi$-mixing condition
is not necessarily preserved if the direction of "time" is reversed.
Referring to eqn. (1.3), define for each $n = 1,2,3,\ldots$,
$\phi_{\mathrm{rev}}(n) := \sup_{J \in \mathbb{Z}} \phi_{\mathrm{rev}}(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty})
= \sup_{J \in \mathbb{Z}} \phi(\mathcal{F}_{J+n}^{\infty}, \mathcal{F}_{-\infty}^{J})$.
In [51, p. 414] there is an example of a strictly stationary
countable-state Markov chain $(X_k, k \in \mathbb{Z})$ such that
$\phi(n) \to 0$ as $n \to \infty$ and $\phi_{\mathrm{rev}}(n) = 1$ for all
$n \ge 1$. "Symmetric" versions of the $\phi$-mixing condition, putting
equal emphasis on $\phi(n)$ and $\phi_{\mathrm{rev}}(n)$, have been useful
in limit theory (see [34] [59]).

In the rest of Section 2, and also in most of the rest of this paper,
we shall deal only with a strictly stationary doubly-infinite sequence
$(X_k, k \in \mathbb{Z})$.

For measure-theoretic convenience, for the rest of Section 2 we shall
assume that our probability space is
$(\mathbb{R}^{\mathbb{Z}}, \mathcal{B}^{\mathbb{Z}}, P)$. (In a context such
as this, the symbol $\mathcal{B}$ is intended to mean the standard Borel
$\sigma$-field.) We shall of course assume that for each
$\omega \in \mathbb{R}^{\mathbb{Z}}$ and each $k \in \mathbb{Z}$,
$X_k(\omega) := \omega_k$. (The notation $\omega_k$ means the $k$th
coordinate of $\omega$.) We shall use a regular conditional distribution of
$(X_1, X_2, X_3, \ldots)$ given $(X_0, X_{-1}, X_{-2}, \ldots)$.
In this context a standard measure-theoretic argument will show that,
under our assumption that $(X_k)$ is strictly stationary, for each
$n \ge 1$,

$$\phi(n) = \operatorname{ess\,sup} \left[ \sup |P(B|\mathcal{F}_{-\infty}^{0}) - P(B)|,\ B \in \mathcal{F}_{n}^{\infty} \right],$$

and

$$\beta(n) = E \left[ \sup |P(B|\mathcal{F}_{-\infty}^{0}) - P(B)|,\ B \in \mathcal{F}_{n}^{\infty} \right].$$

There is another useful formulation of $\beta(n)$. Let $Q$ denote the
probability measure on $(\mathbb{R}^{\mathbb{Z}}, \mathcal{B}^{\mathbb{Z}})$
such that (i) under $Q$ the $\sigma$-fields $\mathcal{F}_{-\infty}^{0}$ and
$\mathcal{F}_{1}^{\infty}$ are independent and (ii) on each of these two
$\sigma$-fields $\mathcal{F}_{-\infty}^{0}$ and $\mathcal{F}_{1}^{\infty}$
the measure $Q$ is identical to $P$. Then (under our assumption of strict
stationarity),

$$\beta(n) = \sup |P(D) - Q(D)|,\ D \in \mathcal{F}_{-\infty}^{0} \vee \mathcal{F}_{n}^{\infty} \;=\; \tfrac{1}{2} \|P_n - Q_n\| \tag{2.1}$$

where $P_n$ (resp. $Q_n$) is the restriction of $P$ (resp. $Q$) to
$\mathcal{F}_{-\infty}^{0} \vee \mathcal{F}_{n}^{\infty}$ and $\|\cdot\|$
denotes total variation.

Let $T$ denote the usual shift operator on $\mathbb{R}^{\mathbb{Z}}$; that
is, for each $\omega \in \mathbb{R}^{\mathbb{Z}}$, $T\omega$ is defined by
$(T\omega)_k := \omega_{k+1}$ for all $k \in \mathbb{Z}$. For any event $A$
($\in \mathcal{B}^{\mathbb{Z}}$) we use the notation
$TA := \{\omega: T^{-1}\omega \in A\}$. Our (strictly stationary) sequence
$(X_k)$ is said to be "mixing", or "mixing in the ergodic-theoretic sense",
if for all $A, B \in \mathcal{F}_{-\infty}^{\infty}$,
$\lim_{n\to\infty} P(A \cap T^n B) = P(A) \cdot P(B)$. (In ergodic theory
this condition is sometimes referred to as "strong mixing", but we shall use
the term "strong mixing" for the condition $\alpha(n) \to 0$ as before.)
Our sequence $(X_k)$ is said to be "regular" if its past tail
$\sigma$-field $\bigcap_{n=1}^{\infty} \mathcal{F}_{-\infty}^{-n}$ is
trivial (i.e. contains only events of probability 0 or 1). It is well known
that

(i) mixing (in the ergodic-theoretic sense) $\Rightarrow$ ergodic,

(ii) regular $\Rightarrow$ mixing (in the ergodic-theoretic sense), and

(iii) strong mixing ($\alpha(n) \to 0$) $\Rightarrow$ regular.

Statements (ii) and (iii) are easy consequences of [47, p. 302, Theorem
17.1.1]. Naturally, in (ii) and (iii) one can replace "regular" by the
condition that the future tail $\sigma$-field
$\bigcap_{n=1}^{\infty} \mathcal{F}_{n}^{\infty}$ be trivial. (In Example
6.2 in Section 6 we shall encounter a well known stationary regular sequence
whose future tail $\sigma$-field fails to be trivial.)

If $(X_k)$ is strictly stationary and absolutely regular, then its double
tail $\sigma$-field
$\bigcap_{n=1}^{\infty} (\mathcal{F}_{-\infty}^{-n} \vee \mathcal{F}_{n}^{\infty})$
is trivial (i.e. $P(D) = 0$ or 1 for every $D$ in the double tail
$\sigma$-field). This holds by (2.1) and an elementary measure-theoretic
argument (one can use e.g. [74, Lemma 4.3]). In [19] a strictly
stationary $\rho$-mixing sequence is constructed for which the double tail
$\sigma$-field fails to be trivial.

Let us briefly give references for several other related mixing
conditions, for strictly stationary sequences. The "information regularity"
condition (see [65]) is like the strong mixing conditions defined above,
using the "coefficient of information" as the basic measure of dependence.
A "Cesaro" variant of strong mixing, known as "uniform ergodicity", was
studied by Cogburn [25]; and Rosenblatt [68, Theorem 2] established a
nice connection between this condition and the strong mixing condition
itself. Another mixing condition weaker than strong mixing has played a
nice role in extreme value theory (see e.g. [54]) as well as in convergence
in distribution to non-normal stable laws (see [29]). A mixing condition
based on characteristic functions was studied in [75]. Finally, by a
theorem of Ornstein, a condition of weak dependence known as the "very
weak Bernoulli" condition characterizes the strictly stationary finite-state
sequences that are isomorphic to a Bernoulli shift. For more information on
the very weak Bernoulli condition, including recent generalizations of it to
stationary real sequences in connection with central limit theory, see [71]
[36] [32] [17] and the references therein.


3.  MIXING  CONDITIONS  FOR  TWO  OR  MORE  SEQUENCES 


Suppose $(X_k, k \in \mathbb{Z})$ and $(Y_k, k \in \mathbb{Z})$ are strongly
mixing sequences that are independent of each other. Then the sequence of
random vectors $((X_k, Y_k), k \in \mathbb{Z})$ is strongly mixing. Hence
the sequence of sums $(X_k + Y_k, k \in \mathbb{Z})$ is also strongly
mixing. The same comments apply to the other mixing conditions being
discussed here. Pinsker [65, p. 73] noted this for absolute regularity.
Under natural extra restrictions, such comments can be extended from two to
countably many sequences that are independent of each other.

Here we shall just present the basic propositions from which all of these
comments can easily be deduced.

The first result is due to Csaki and Fischer [27, p. 40, Theorem 6.2]:


Theorem 3.1 (Csaki and Fischer): Suppose $\mathcal{A}_n$ and $\mathcal{B}_n$,
$n = 1,2,3,\ldots$ are $\sigma$-fields and the $\sigma$-fields
$(\mathcal{A}_n \vee \mathcal{B}_n)$, $n = 1,2,3,\ldots$ are independent.
Then

$$\rho\Bigl(\bigvee_{n=1}^{\infty} \mathcal{A}_n,\ \bigvee_{n=1}^{\infty} \mathcal{B}_n\Bigr) = \sup_{n \ge 1} \rho(\mathcal{A}_n, \mathcal{B}_n).$$


For a short proof see Witsenhausen [76, Theorem 1]. In Example 4.4 in the
next section an interesting application of Theorem 3.1 will be given. For
the other dependence coefficients, slightly weaker statements hold:


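Theorem 3.2: If the hypothesis of Theorem 3.1 is satisfied, then the
following statements hold:

(i) $\alpha\bigl(\bigvee_{n=1}^{\infty} \mathcal{A}_n,\ \bigvee_{n=1}^{\infty} \mathcal{B}_n\bigr) \le \sum_{n=1}^{\infty} \alpha(\mathcal{A}_n, \mathcal{B}_n)$.

(ii) $\beta\bigl(\bigvee_{n=1}^{\infty} \mathcal{A}_n,\ \bigvee_{n=1}^{\infty} \mathcal{B}_n\bigr) \le \sum_{n=1}^{\infty} \beta(\mathcal{A}_n, \mathcal{B}_n)$.

(iii) $\phi\bigl(\bigvee_{n=1}^{\infty} \mathcal{A}_n,\ \bigvee_{n=1}^{\infty} \mathcal{B}_n\bigr) \le \sum_{n=1}^{\infty} \phi(\mathcal{A}_n, \mathcal{B}_n)$.

(iv) $\psi\bigl(\bigvee_{n=1}^{\infty} \mathcal{A}_n,\ \bigvee_{n=1}^{\infty} \mathcal{B}_n\bigr) \le \bigl(\prod_{n=1}^{\infty} [1 + \psi(\mathcal{A}_n, \mathcal{B}_n)]\bigr) - 1$.

Statements (i)-(iii) can be found in [12, Lemma 8], [20, Lemma 2.2], and
[11, Lemma 2.2]. Statements (ii)-(iii) can also be derived easily from
[37, Lemma 1]. Statement (iv) is an elementary consequence of [14, Lemma 1].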


4.  MIXING  CONDITIONS  FOR  MARKOV  CHAINS 


Here a brief discussion is given of Markov chains satisfying strong
mixing conditions. For a more thorough discussion of this topic see
Rosenblatt [67, Chapter 7].

The following theorem is fundamental to the study of mixing conditions
on Markov chains:

Theorem 4.1. Suppose $(X_k, k \in \mathbb{Z})$ is a strictly stationary real
Markov chain. Then for all $n \ge 1$ the following five statements hold:

(i) $\alpha(n) = \alpha(\sigma(X_0), \sigma(X_n))$.

(ii) $\rho(n) = \rho(\sigma(X_0), \sigma(X_n))$.

(iii) $\beta(n) = \beta(\sigma(X_0), \sigma(X_n))$.

(iv) $\phi(n) = \phi(\sigma(X_0), \sigma(X_n))$.

(v) $\psi(n) = \psi(\sigma(X_0), \sigma(X_n))$.


The proof is an elementary measure-theoretic exercise using the Markov
property. For example, see [8, Lemma 8] for a proof of (v). (Thus for
Markov chains, $\psi$-mixing is equivalent to the "*-mixing" condition
studied in [8].) As a consequence of (iv), for Markov chains the
$\phi$-mixing condition is equivalent to Doeblin's condition (see [67,
p. 212, eqn. (18)]).
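
For a finite-state chain, Theorem 4.1 makes the coefficients directly
computable from the n-step transition matrix. The sketch below is an
illustration under stated assumptions, not a construction from the text: it
assumes a finite state space with a strictly positive invariant vector, and
it uses the standard fact that for finite $\sigma$-fields the suprema in
(1.2), (1.4) and (1.6) are attained on atoms. The function names and the
example matrix are hypothetical.

```python
from fractions import Fraction as F


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(p, n):
    """n-step transition matrix p^n, for n >= 1."""
    out = p
    for _ in range(n - 1):
        out = mat_mul(out, p)
    return out


def coefficients(p, pi, n):
    """Return (beta(n), phi(n), psi(n)) for the stationary chain with one-step
    transition matrix p and invariant probability vector pi (all entries > 0)."""
    states = range(len(pi))
    # Check that pi really is invariant: pi * p = pi.
    assert all(sum(pi[i] * p[i][j] for i in states) == pi[j] for j in states)
    pn = n_step(p, n)
    # Total variation distance between P(X_n in . | X_0 = i) and the law of X_n.
    tv = [sum(abs(pn[i][j] - pi[j]) for j in states) / 2 for i in states]
    beta = sum(pi[i] * tv[i] for i in states)   # (1.6) evaluated on the atoms
    phi = max(tv)                               # (1.2): worst starting state
    psi = max(abs(pn[i][j] / pi[j] - 1) for i in states for j in states)  # (1.4)
    return beta, phi, psi


if __name__ == "__main__":
    # Hypothetical two-state example with invariant vector (2/3, 1/3).
    p = [[F(9, 10), F(1, 10)], [F(1, 5), F(4, 5)]]
    pi = [F(2, 3), F(1, 3)]
    for n in (1, 2, 3):
        print(n, coefficients(p, pi, n))
```

In this example each of the three coefficients is multiplied by 7/10 (the
second eigenvalue of the matrix) at every step, which is the exponential
decay one expects for a finite-state irreducible aperiodic chain.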

For the next theorem we shall use the following terminology: A sequence
of non-negative numbers $a_1, a_2, a_3, \ldots$ is said to "converge to 0
exponentially fast" if there exists a positive number $r$ such that
$a_n = O(e^{-rn})$ as $n \to \infty$.

Theorem 4.2. Suppose $(X_k, k \in \mathbb{Z})$ is a strictly stationary real
Markov chain. Then the following three statements hold:

(i) If $\rho(n) \to 0$, then $\rho(n) \to 0$ exponentially fast.

(ii) If $\phi(n) \to 0$, then $\phi(n) \to 0$ exponentially fast.

(iii) If $\psi(n) \to 0$, then $\psi(n) \to 0$ exponentially fast.

Part (iii) was proved in [8, pp. 8-9, Theorem 5]. The arguments for parts
(i) and (ii) are similar. (For part (i) a simple argument using (1.10)
and the Markov property will show the well known inequality
$\rho(m+n) \le \rho(m) \cdot \rho(n)$ for all positive integers $m$ and $n$.
For part (ii) see e.g. [67, p. 209, Lemma 3].) Theorem 4.2 does not extend
to either $\alpha(n)$ or $\beta(n)$. As a consequence of the classic
convergence theorem for transition probabilities, any strictly stationary
countable-state irreducible aperiodic Markov chain is absolutely regular.
Such Markov chains exist for which the rate of convergence of $\alpha(n)$
(and hence also the rate for $\beta(n)$) to 0 is slower than exponential
(see e.g. [30, Examples 1 and 2] or [51, p. 414, Corollary 1]). (By Theorem
4.2 such Markov chains cannot be $\rho$-mixing.) Of course every stationary
finite-state irreducible aperiodic Markov chain is $\psi$-mixing (with
exponential mixing rate).
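
The passage from the inequality $\rho(m+n) \le \rho(m)\rho(n)$ to an
exponential rate in part (i) can be sketched as follows. The integer $m$ and
the constant $r$ are notation introduced here for illustration; the argument
also uses the fact that $\rho(\cdot)$ is non-increasing.

```latex
% Assume \rho(n) \to 0 and fix an integer m \ge 1 with \rho(m) < 1.
% (If \rho(m) = 0, then \rho(n) = 0 for all n \ge m and there is nothing to prove.)
% Submultiplicativity gives \rho(jm) \le [\rho(m)]^{j}, and \rho(\cdot) is non-increasing, so
\[
\rho(n) \le \rho\bigl(m \lfloor n/m \rfloor\bigr)
\le [\rho(m)]^{\lfloor n/m \rfloor}
\le [\rho(m)]^{-1} e^{-rn},
\qquad n \ge m, \quad r := -\frac{\log \rho(m)}{m} > 0 .
\]
% Hence \rho(n) = O(e^{-rn}) as n \to \infty.
```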

A strictly stationary real Markov chain $(X_k)$ is said to be a "Harris
chain" if it has the Harris recurrence property: There exists a regular
version of the conditional distribution of $(X_1, X_2, X_3, \ldots)$ given
$X_0$ such that for every $x \in \mathbb{R}$, for every Borel subset
$B \subset \mathbb{R}$ such that $P(X_0 \in B) > 0$, one has that
$P(X_n \in B$ for infinitely many positive integers $n \mid X_0 = x) = 1$.
(Thus every stationary countable-state irreducible Markov chain is a
stationary Harris chain. Also, non-stationary Harris chains will not be
discussed here.)

It is well known that every stationary Harris chain has a well defined
"period" $p \in \{1,2,3,\ldots\}$ (the chain is said to be "aperiodic" if
$p = 1$). This fact and the next theorem can be seen (with a little work)
from Orey [57, p. 13, Theorem 3.1; p. 23, Theorem 5.1; and p. 25, lines
9-13].

Theorem 4.3. (i) Every strictly stationary real aperiodic Harris chain is
absolutely regular. (ii) More generally, for any strictly stationary real
Harris chain, $\lim_{n\to\infty} \beta(n) = 1 - 1/p$ where $p$ is the
period.

A sequence $(X_k)$ is said to be an "instantaneous function" of a real
Markov chain $(Y_k)$ if there is a Borel-measurable function
$f: \mathbb{R} \to \mathbb{R}$ such that for each $k \in \mathbb{Z}$,
$X_k = f(Y_k)$. As a consequence of Theorem 4.3(i), any instantaneous
function of a stationary real aperiodic Harris chain is a stationary
absolutely regular sequence. In [4] there are some stationary
$\phi$-mixing sequences which cannot be represented (on any probability
space) as an instantaneous function of a stationary real Harris chain
(periodic or aperiodic). It is apparently unknown whether any such sequences
exist which are $\psi$-mixing or even 1-dependent.

Example 4.4. Rosenblatt [67, p. 214, line -3 to p. 215, line 13] presents a
class of stationary real Markov chains which are $\rho$-mixing but not
absolutely regular. (Consequently, they are not Harris chains.) His
construction is based on "random rotations". Here we shall construct one of
those examples in a different way, as an application of Theorem 3.1.

As a preliminary step, consider a stationary Markov chain
$(W_k, k \in \mathbb{Z})$ with two states $\{0,1\}$, with invariant
probability vector $(1/2, 1/2)$, and with one-step transition probability
matrix $(p_{ij})$ given by $p_{00} = p_{11} = 3/4$,
$p_{10} = p_{01} = 1/4$. By an induction argument, for each $n \ge 1$ the
n-step transition probability matrix $(p_{ij}^{(n)})$ is given by
$p_{00}^{(n)} = p_{11}^{(n)} = (1 + 2^{-n})/2$,
$p_{10}^{(n)} = p_{01}^{(n)} = (1 - 2^{-n})/2$. A simple argument will show
that for each $n \ge 1$, $(W_k)$ satisfies
$\rho(n) = \rho(\sigma(W_0), \sigma(W_n)) = |\mathrm{Corr}(W_0, W_n)| = 2^{-n}$.
Here the first equality comes from Theorem 4.1 and the second from the
elementary fact that every function of a two-state r.v. is automatically an
affine function.
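
The two formulas for this two-state chain are easy to confirm with exact
rational arithmetic. The sketch below is only a numerical check for small
$n$, not a substitute for the induction argument; the helper names `n_step`
and `corr_w0_wn` are introduced here for illustration.

```python
from fractions import Fraction as F

# One-step transition matrix of the two-state chain (W_k): p00 = p11 = 3/4, p01 = p10 = 1/4.
P = [[F(3, 4), F(1, 4)],
     [F(1, 4), F(3, 4)]]


def n_step(p, n):
    """n-step transition matrix p^n of a two-state chain, for n >= 1."""
    out = p
    for _ in range(n - 1):
        out = [[sum(out[i][k] * p[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return out


def corr_w0_wn(n):
    """Corr(W_0, W_n) when W_0 has the invariant distribution (1/2, 1/2)."""
    pn = n_step(P, n)
    e_w0_wn = F(1, 2) * pn[1][1]   # E[W_0 W_n] = P(W_0 = 1, W_n = 1)
    mean = F(1, 2)                 # E[W_0] = E[W_n]
    var = mean * (1 - mean)        # Var(W_0) = Var(W_n)
    return (e_w0_wn - mean * mean) / var


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, n_step(P, n)[0][0], corr_w0_wn(n))
```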

Now let $(W_k^{(j)}, k \in \mathbb{Z})$, $j = 1,2,3,\ldots$ be independent
Markov chains, each having the same distribution as the Markov chain
$(W_k, k \in \mathbb{Z})$ above. Define the sequence
$(X_k, k \in \mathbb{Z})$ by $X_k := \sum_{j=1}^{\infty} 2^{-j} W_k^{(j)}$.
Up to null sets, for each $k$,
$\sigma(X_k) = \sigma(W_k^{(j)}, j = 1,2,\ldots)$ (i.e. from $X_k$ one can
calculate the value of $W_k^{(j)}$, $j = 1,2,3,\ldots$ a.s.). It follows
that the sequence $(X_k)$ is a Markov chain, and it is easy to see that it
is strictly stationary. Further, $(X_k)$ satisfies $\rho(n) = 2^{-n}$ for
all $n \ge 1$, by Theorem 3.1 and the properties of $(W_k)$ above; and hence
$(X_k)$ is $\rho$-mixing. In the framework of eqn. (2.1) one has that for
each $n \ge 1$, by a simple calculation and the strong law of large numbers,
$P(\lim_{J\to\infty} J^{-1} \sum_{j=1}^{J} W_0^{(j)} W_n^{(j)} = \tfrac{1}{4}(1 + 2^{-n})) = 1$
and
$Q(\lim_{J\to\infty} J^{-1} \sum_{j=1}^{J} W_0^{(j)} W_n^{(j)} = \tfrac{1}{4}) = 1$,
and hence $P$ and $Q$ are mutually singular on $\sigma(X_0, X_n)$. Hence
$(X_k)$ satisfies $\beta(n) = 1$ for all $n \ge 1$, i.e. $(X_k)$ fails to
be absolutely regular. This completes Example 4.4.


5.  MIXING  CONDITIONS  FOR  GAUSSIAN  SEQUENCES 

For stationary real Gaussian sequences, a thorough discussion of the
various mixing conditions is given by Ibragimov and Rozanov [48] [49,
Chapters 4-5]. Theorem 5.1 here essentially just lists a few basic results
from that discussion.

Theorem 5.1: Suppose $(X_k, k \in \mathbb{Z})$ is a (non-degenerate)
stationary real Gaussian sequence. Then the following four statements hold:

(1) The following two conditions are equivalent:

(a) $(X_k)$ is regular.

(b) $(X_k)$ has an absolutely continuous spectral distribution function, and
$\int_{-\pi}^{\pi} \log f(\lambda)\, d\lambda > -\infty$.

(2) The following three conditions are equivalent:

(a) $(X_k)$ is strongly mixing.

(b) $(X_k)$ is $\rho$-mixing.

(c) The spectral density $f$ of $(X_k)$ can be expressed in the form
$f(\lambda) = |P(e^{i\lambda})|^2 \exp[u(e^{i\lambda}) + \tilde{v}(e^{i\lambda})]$
where $P$ is a polynomial, $u$ and $v$ are continuous real functions on the
unit circle (in the complex plane), and $\tilde{v}$ is the conjugate
function of $v$.

(3) The following two conditions are equivalent:

(a) $(X_k)$ is absolutely regular.

(b) The spectral density $f$ of $(X_k)$ can be expressed in the form
$f(\lambda) = |P(e^{i\lambda})|^2 \exp[\sum_{j=-\infty}^{\infty} a_j e^{ij\lambda}]$
(the sum converging in $L^2[-\pi,\pi]$) where $P$ is a polynomial whose
roots (if there are any) lie on the unit circle and
$\sum_{j=-\infty}^{\infty} |j| \cdot |a_j|^2 < \infty$.

(4) The following four conditions are equivalent:

(a) $(X_k)$ is $\phi$-mixing.

(b) $(X_k)$ is $\psi$-mixing.

(c) $(X_k)$ is m-dependent.

(d) The spectral density $f$ of $(X_k)$ can be expressed in the form
$f(\lambda) = |P(e^{i\lambda})|^2$ where $P$ is a polynomial.
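
The link between (c) and (d) in statement (4) can be illustrated with a
finite moving average. The sketch below assumes $X_k = \sum_{j=0}^{m} b_j
Z_{k-j}$ with $(Z_k)$ i.i.d. of variance 1, and it ignores the
normalizing constant in the spectral density (conventions differ by a factor
such as $2\pi$, which can be absorbed into $P$). The coefficient list `b`
and all function names are hypothetical. It checks that the autocovariances
vanish beyond lag $m$ and that their Fourier sum equals
$|P(e^{i\lambda})|^2$ with $P(z) = \sum_j b_j z^j$.

```python
import cmath


def autocovariance(b, h):
    """gamma(h) = sum_j b[j] * b[j + |h|] for X_k = sum_j b[j] * Z_{k-j}, Var(Z_k) = 1."""
    h = abs(h)
    # The range is empty (so the sum is 0) as soon as |h| exceeds the order m = len(b) - 1.
    return sum(b[j] * b[j + h] for j in range(len(b) - h))


def poly_modulus_squared(b, lam):
    """|P(e^{i*lam})|^2 with P(z) = sum_j b[j] * z**j."""
    z = cmath.exp(1j * lam)
    return abs(sum(bj * z ** j for j, bj in enumerate(b))) ** 2


def fourier_sum(b, lam):
    """sum over |h| <= m of gamma(h) * e^{i*h*lam}; real because gamma is symmetric."""
    m = len(b) - 1
    total = sum(autocovariance(b, h) * cmath.exp(1j * h * lam)
                for h in range(-m, m + 1))
    return total.real


if __name__ == "__main__":
    b = [1.0, 0.5, -0.25]   # hypothetical MA(2) coefficients
    for lam in (0.0, 0.7, 2.0):
        print(lam, poly_modulus_squared(b, lam), fourier_sum(b, lam))
```

Since a Gaussian sequence with zero covariance beyond lag $m$ is
m-dependent, this is the simplest instance of the equivalence.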

A few comments are in order. If $f$ is the spectral density of a
stationary real (not complex) Gaussian sequence then of course $f$ is
symmetric about 0. In connection with statement (2), one has (for Gaussian
sequences) that $\rho(n) \le 2\pi \cdot \alpha(n)$ and that $\rho(n)$ is
identical to the supremum of $|\mathrm{Corr}(Y,Z)|$ taken over all finite
linear combinations
$Y = a_0 X_0 + a_{-1} X_{-1} + \ldots + a_{-M} X_{-M}$ and
$Z = a_n X_n + a_{n+1} X_{n+1} + \ldots + a_{n+N} X_{n+N}$ (see [52,
Theorems 1 and 2]). In (2) the equivalence of (c) with (b) comes from the
formulation of the Helson-Sarason theorem given in [70, p. 62]. From (2)
and (3) we see that in order to construct a stationary real Gaussian
sequence which is $\rho$-mixing but not absolutely regular, one can simply
choose a spectral density which is positive and continuous but very
"jagged", such as
$f(\lambda) = \exp[\sum_{j=1}^{\infty} 2^{-j} \cos(2^{2j}\lambda)]$. For a
stationary real $\rho$-mixing (or even absolutely regular) Gaussian sequence
the spectral density need not be continuous or even bounded; consider an
example with spectral density
$f(\lambda) = \exp[\sum_{j=2}^{\infty} (j \log j)^{-1} \cos(j\lambda)]$
(which satisfies $\lim_{\lambda \to 0} f(\lambda) = +\infty$ by [77, p. 188,
Theorem 2.15]). For more examples see [49, pp. 179-180]. For part (4) one
can use the Wold decomposition theorem to show that any stationary real
$\phi$-mixing Gaussian sequence must be a moving average of i.i.d. Gaussian
r.v.'s; one uses the fact that if $Y$ and $Z$ are jointly Gaussian r.v.'s
with $\mathrm{Corr}(Y,Z) \ne 0$ then $\phi(\sigma(Y), \sigma(Z)) = 1$. To
see this latter fact, say in the case where $\mathrm{Corr}(Y,Z) > 0$, one
can first note that $P(Z > q)$ becomes arbitrarily small as $q > 0$ becomes
sufficiently large, and then for $q$ fixed, $P(Z > q \mid Y > r)$ becomes
arbitrarily close to 1 as $r > 0$ becomes sufficiently large.

Ibragimov and Rozanov [48] [49, p. 182, Lemma 17, and p. 190, Note 2]
proved that every stationary Gaussian sequence satisfying
$\sum_{n=1}^{\infty} \rho(2^n) < \infty$ has a continuous spectral density
$f(\lambda)$; they derived for each $n \ge 1$ an upper bound on the "uniform
error" of the "best" approximation of $f$ by a trigonometric polynomial of
degree < n. Their result introduced the (logarithmic) mixing rate
$\sum_{n=1}^{\infty} \rho(2^n) < \infty$ into the literature, along with
some of the techniques for handling this mixing rate. In central limit
theory this mixing rate has turned out to be quite prominent for
$\rho$-mixing (see e.g. [60] [58] [38] and the references therein).


6.  SOME  OTHER  SPECIAL  EXAMPLES 


Here  we  shall  briefly  describe  the  strong  mixing  properties  of  a  few 
stationary  sequences  that  arise  in  other  areas  such  as  time  series  analysis, 
number  theory,  and  interacting  particle  systems.  (There  will  be  a  slight 
overlap  with  the  Markov  chains  and  Gaussian  sequences  studied  in  Sections 
4 and 5.) In most of these examples we shall encounter strong mixing
conditions with exponential mixing rates -- a context in which the known limit
theory  under  strong  mixing  conditions  usually  applies  very  nicely.  However, 
in  Example  6.2  below,  we  shall  also  look  at  a  well  known  simple  stationary 
AR(1)  process  (autoregressive  process  of  order  1)  which  fails  to  be  strongly 
mixing. 


Example 6.1. Suppose $(Z_k, k \in \mathbb{Z})$ is an i.i.d. sequence and the
marginal distribution of $Z_0$ is absolutely continuous with a density which
(to start with a few nice specific cases) is Gaussian, Cauchy, exponential,
or uniform (on some interval). Suppose $a_0, a_1, a_2, \ldots$ is a sequence
of real numbers with $|a_n| \to 0$ exponentially fast. Then the random
sequence $(X_k, k \in \mathbb{Z})$ defined by

$$X_k := \sum_{j=0}^{\infty} a_j Z_{k-j}$$

is well defined, strictly stationary, and satisfies absolute regularity with
exponential mixing rate. In particular this includes the cases where $(X_k)$
is a stationary ARMA (autoregressive-mixed-moving-average) process based on
the i.i.d. sequence $(Z_k)$ given above. The conditions on $(a_n)$ can be
relaxed somewhat; if $a_n \to 0$ at a sufficiently fast polynomial rate, then
$(X_k)$ will still satisfy absolute regularity, but the mixing rate may be
slower than exponential. Of course all these statements apply to a much
broader class of density functions for $Z_0$ than just the ones given above.
For details see [61] and the references therein. As the next example will
show, the above results in general do not carry over to the case where the
distribution of $Z_0$ is discrete.

Example 6.2. The following example is well known; see e.g. [69, p. 267].
Suppose $(Z_k, k \in \mathbb{Z})$ is i.i.d. with
$P(Z_k = 0) = P(Z_k = 1) = 1/2$. Define the sequence
$(X_k, k \in \mathbb{Z})$ by

$$X_k := (1/2)Z_k + (1/4)Z_{k-1} + (1/8)Z_{k-2} + (1/16)Z_{k-3} + \cdots$$

Then $(X_k)$ is a strictly stationary AR(1) process; it can be represented by
$X_k = (1/2)X_{k-1} + (1/2)Z_k$. For each $k$ the r.v. $X_k$ is uniformly
distributed on the interval [0,1]. (Note that the digits in the binary
expansion of $X_k$ are $Z_k, Z_{k-1}, Z_{k-2}, \ldots$.) For each $k$ one also
has that $X_k$ is a (Borel-measurable) function of $X_{k+1}$ (up to null
sets): $X_k$ = (fractional part of $2X_{k+1}$) a.s. As a consequence of all
this, for each $n \ge 1$, $X_0$ is a (Borel-measurable) function of $X_n$ (by
induction) and hence

$$\alpha(n) \ge \alpha(\sigma(X_0), \sigma(X_n)) \ge \alpha(\sigma(X_0), \sigma(X_0)) \ge P(X_0 < 1/2) - [P(X_0 < 1/2)]^2 = 1/4.$$

Hence $(X_k)$ fails to be strongly mixing; in fact $\alpha(n) = 1/4$ for all
$n \ge 1$ by (1.11). Indeed, even though $(X_k)$ is regular, its future tail
$\sigma$-field $\bigcap_{n=1}^{\infty} \mathcal{F}_n^{\infty}$ is non-trivial
(it coincides with $\mathcal{F}_{-\infty}^{\infty}$ up to null sets). Of course
one can make this example symmetric about 0 by replacing $Z_k$ by
$Z_k - 1/2$.
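
The key step in this example, that $X_k$ can be read off from $X_{k+1}$ alone, can be sketched in exact rational arithmetic. This is only an illustration under a simplifying assumption: the innovations are taken to be 0 before time 0 (so each $X_k$ has a finite binary expansion), and the names `ar1_path`, `frac_part`, and `recover_x0` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction
import random

def ar1_path(z):
    """X_k = X_{k-1}/2 + Z_k/2, started from 0 (i.e. Z_j = 0 for j < 0)."""
    x, path = Fraction(0), []
    for zk in z:
        x = x / 2 + Fraction(zk, 2)
        path.append(x)
    return path

def frac_part(q):
    """Fractional part of a non-negative Fraction."""
    return q - (q.numerator // q.denominator)

def recover_x0(xn, n):
    """Apply x -> frac(2x) n times, recovering X_0 from X_n."""
    for _ in range(n):
        xn = frac_part(2 * xn)
    return xn

rng = random.Random(0)                     # fixed seed, for reproducibility only
z = [rng.randint(0, 1) for _ in range(40)]
x = ar1_path(z)

# X_k is the fractional part of 2 X_{k+1} ...
assert all(frac_part(2 * x[k + 1]) == x[k] for k in range(len(x) - 1))
# ... so X_0 is a deterministic function of X_n, however large n is.
assert recover_x0(x[39], 39) == x[0]
```

Because the event $\{X_0 < 1/2\}$ is therefore measurable with respect to $\sigma(X_n)$ for every $n$, the dependence between past and future never decays, which is what the displayed chain of inequalities expresses.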




Example 6.3. This is another well known example, related to number theory.
For every irrational number $x \in (0,1)$ there exists a unique sequence of
positive integers $x_1, x_2, x_3, \ldots$ such that the following "continued
fraction" expansion holds:

$$x = \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cdots}}}$$

Suppose we impose the Gauss probability measure on [0,1], namely the measure
which is absolutely continuous with density
$f(x) := (\log 2)^{-1}(1+x)^{-1}$. In this context the sequence
$(x_1, x_2, x_3, \ldots)$ is a strictly stationary sequence of random
variables, and it is $\psi$-mixing with exponential mixing rate (see
[62, p. 450, Corollary 1]). For applications of limit theorems under strong
mixing conditions to number theory (including this continued fraction
expansion), see [62] [63].
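
As a small illustration of how the digits $x_1, x_2, \ldots$ arise, the sketch below iterates the map $x \mapsto 1/x - \lfloor 1/x \rfloor$ in exact arithmetic. It assumes a rational input as a stand-in (a rational has only finitely many digits, whereas the example concerns irrational $x$), and the function names `cf_digits` and `cf_value` are hypothetical.

```python
from fractions import Fraction

def cf_digits(x, n):
    """First (at most) n continued fraction digits of a rational x in (0, 1)."""
    digits = []
    while x != 0 and len(digits) < n:
        inv = 1 / x
        a = inv.numerator // inv.denominator   # integer part of 1/x
        digits.append(a)
        x = inv - a                            # fractional part of 1/x
    return digits

def cf_value(digits):
    """Evaluate 1/(d_1 + 1/(d_2 + ... + 1/d_m)) for a finite digit list."""
    value = Fraction(0)
    for a in reversed(digits):
        value = 1 / (a + value)
    return value

x = Fraction(16, 113)
digits = cf_digits(x, 10)
assert cf_value(digits) == x    # the finite expansion reproduces x exactly
```

For an irrational $x$ the loop never terminates on its own, so in practice one would truncate after `n` digits; the mixing statement in the text concerns the joint law of these digits when $x$ is drawn from the Gauss measure.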


Example 6.4. There has been a lot of research on the ergodic-theoretic
properties of mappings $T: [0,1] \to [0,1]$. See e.g. [55] as well as the
references given below. Example 6.3 above fits into this framework in a
natural way (see e.g. [62, pp. 448-449]). Here we shall just describe one
other simple example. Suppose $2^{1/2} < \lambda \le 2$. Consider the mapping
$T: [0,1] \to [0,1]$ defined by

$$T(x) := \begin{cases} \lambda x + 2 - \lambda & \text{if } 0 \le x \le 1 - 1/\lambda, \\ -\lambda x + \lambda & \text{if } 1 - 1/\lambda \le x \le 1. \end{cases}$$

The graph of this function looks somewhat like an inverted, chopped letter
V with apex at the point $(1 - 1/\lambda, 1)$. By [53, Theorem 1] there
exists on [0,1] an absolutely continuous probability measure $\mu$ which is
$T$-invariant. In [45, Theorem 1(vi)] a "canonical" method is described for
defining $\mu$. Under this measure $\mu$ the transformation $T$ is "weak
mixing" by [10, Theorem 2(ii)]. (We need not give the definition of "weak
mixing" here.) Now let $I_1, I_2, \ldots, I_M$ be an arbitrary partition of
[0,1] into finitely many intervals, and define on
$([0,1], \mathcal{B}_{[0,1]}, \mu)$ the (strictly stationary) sequence
$(X_k, k = 1,2,3,\ldots)$ by

$$X_k(x) := m \text{ if } T^k(x) \in I_m, \quad m = 1,2,\ldots,M.$$

By [45, p. 132, lines 4-6], $(X_k)$ is absolutely regular with exponential
mixing rate.
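
The map $T$ and the symbolic sequence $(X_k)$ can be sketched as follows. This is an illustration only: it uses floating-point iteration (which does not follow a true orbit of a chaotic map for long), the starting point is supplied by the caller rather than drawn from the invariant measure, and the names `make_T`, `symbolic_orbit`, and the `cuts` parameter (the interior endpoints of the partition, with each interval closed on the left) are assumptions of this sketch.

```python
from bisect import bisect_right

def make_T(lam):
    """Return the piecewise linear map T of Example 6.4 for sqrt(2) < lam <= 2."""
    if not (2 ** 0.5 < lam <= 2):
        raise ValueError("lam must satisfy sqrt(2) < lam <= 2")
    apex = 1 - 1 / lam          # T(apex) = 1 on both branches

    def T(x):
        if x <= apex:
            return lam * x + 2 - lam   # rising branch, from 2 - lam up to 1
        return -lam * x + lam          # falling branch, from 1 down to 0
    return T

def symbolic_orbit(T, x, cuts, n):
    """Return [X_1, ..., X_n], where X_k = m if T^k(x) lies in the m-th interval.

    `cuts` is the sorted list of interior endpoints of the partition of [0, 1].
    """
    labels = []
    for _ in range(n):
        x = T(x)
        labels.append(bisect_right(cuts, x) + 1)
    return labels

if __name__ == "__main__":
    T = make_T(1.5)
    print(symbolic_orbit(T, 0.2, [1 / 3, 2 / 3], 20))
```

With `lam = 2` the map reduces to the familiar tent map, which gives an easy sanity check on the two branches.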


Example 6.5. Gibbs measures have sometimes been used in the study of
interacting particle systems. We shall discuss Gibbs measures in just the
simplest context. Suppose $\Phi: \{0,1\}^{\mathbb{Z}} \to \mathbb{R}$ is a
function with the following restriction on its "variation": $\exists a > 0$,
$\exists C > 0$, such that for all $m = 1,2,3,\ldots$,

$$\sup\{|\Phi(x) - \Phi(y)| : x, y \in \{0,1\}^{\mathbb{Z}} \text{ such that } x_k = y_k \ \forall k = -m, -m+1, \ldots, m\} \le Ce^{-am}$$

(where the $k$th coordinate of any $x \in \{0,1\}^{\mathbb{Z}}$ is denoted
$x_k$). Then there exists a unique shift-invariant probability measure $\mu$
on $\{0,1\}^{\mathbb{Z}}$ with the following property: $\exists c_1 > 0$,
$\exists c_2 > 0$, $\exists q \in \mathbb{R}$ such that for all
$x \in \{0,1\}^{\mathbb{Z}}$, for all $m \ge 1$,

$$c_1 \le \frac{\mu\{y : y_k = x_k \text{ for all } k = 0,1,\ldots,m-1\}}{\exp\bigl[-qm + \sum_{0 \le k \le m-1} \Phi(T^k(x))\bigr]} \le c_2,$$

where $T$ is the usual shift operator on $\{0,1\}^{\mathbb{Z}}$. This measure
$\mu$ is called the "Gibbs measure" based on the function $\Phi$. (For
details see e.g. [9] [35].) On the probability space
$(\{0,1\}^{\mathbb{Z}}, \mathcal{B}_{\{0,1\}^{\mathbb{Z}}}, \mu)$ define the
(strictly stationary) random sequence $(X_k, k \in \mathbb{Z})$ by
$X_k(\omega) := \omega_k$. One can interpret $X_k = 0$ resp. 1 to mean that
the $k$th site is empty resp. occupied by a particle. This sequence $(X_k)$
is $\psi$-mixing with exponential mixing rate (see [9, p. 24] or


7.  THE  BEHAVIOR  OF  THE  DEPENDENCE  COEFFICIENTS 


First we shall examine the possible limit values of the dependence
coefficients.

Theorem 7.1: If $(X_k, k \in \mathbb{Z})$ is strictly stationary and mixing
in the ergodic-theoretic sense, then the following three statements hold:

(i) Either $\beta(n) \to 0$ as $n \to \infty$ or $\beta(n) = 1$ for all $n \ge 1$.

(ii) Either $\phi(n) \to 0$ as $n \to \infty$ or $\phi(n) = 1$ for all $n \ge 1$.

(iii) Either $\psi(n) \to 0$, $\psi(n) \to 1$, or $\psi(n) = \infty$ for all $n \ge 1$.

Statements (i) and (ii) can be found in [13, Theorem 1] and [11, Theorem 1],
and statement (iii) is a trivial consequence of [14, Theorem 1]. Statement
(i) is a slight extension of an earlier result of Volkonskii and Rozanov
[74, Theorem 4.1]. Statement (ii) was previously known for stationary
Markov chains (see [67, p. 209, Lemma 3]) and of course for stationary
Gaussian sequences (see Section 5). Theorem 7.1 does not extend to either
$\alpha(n)$ or $\rho(n)$. Instead, for stationary regular sequences,
$\lim \alpha(n)$ can be any value in [0,1/4] and $\lim \rho(n)$ can be any
value in [0,1]. (See eqn. (1.11) and [12, Theorem 6]; regularity was not
mentioned in this theorem, but is an elementary property of the construction
given in the proof.) Berbee [2, Theorem 2.1] proved an analog of Theorem 7.1
for stationary sequences which are ergodic but not assumed to be mixing:

Theorem 7.2 (Berbee): If $(X_k, k \in \mathbb{Z})$ is strictly stationary and
ergodic, then $\lim \beta(n) = 1 - 1/p$ for some
$p \in \{1,2,3,\ldots\} \cup \{\infty\}$. If this $p$ satisfies
$2 \le p < \infty$, then letting $T$ denote the usual shift operator (on
events in $\mathcal{F}_{-\infty}^{\infty}$), the invariant $\sigma$-field of
$T^p$ is identical to each tail $\sigma$-field of $(X_k)$ (up to null sets)
and is purely atomic with exactly $p$ atoms, each having probability $1/p$
(the $p$ atoms are $A, TA, T^2A, \ldots, T^{p-1}A$ where $A$ is any one of the
atoms), and conditional on any one of these atoms the sequence of random
vectors $(Y_k, k \in \mathbb{Z})$ defined by
$Y_k := (X_{(k-1)p+1}, X_{(k-1)p+2}, \ldots, X_{kp})$ is strictly stationary
and satisfies the absolute regularity condition.

Theorem 4.3(ii) is in essence a special case of Theorem 7.2. Also, as a
simple corollary of Theorem 7.2 (after one applies Theorem 7.1(ii)(iii) to
the sequence $(Y_k)$ there if $2 \le p < \infty$) one has the following
additional properties of strictly stationary ergodic sequences $(X_k)$:

(i) $\phi(n) \to 1 - 1/p$ for some $p \in \{1,2,3,\ldots\} \cup \{\infty\}$,

(ii) $\psi(n) \to p - 1$ for some $p \in \{1,2,3,\ldots\} \cup \{\infty\}$.

(In particular, for example, if $\lim \beta(n) = 1 - 1/p$ for some (finite)
positive integer $p$, then either $\lim \phi(n) = 1 - 1/p$ for the same $p$
or else $\phi(n) = 1$ for all $n$.)
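
A simple instance of the limit value $1 - 1/p$ can be computed by hand or by machine. The sketch below takes the deterministic cyclic chain $X_{k+1} = X_k + 1 \pmod p$ with $X_0$ uniform on $\{0,\ldots,p-1\}$ (an example chosen here for illustration, not one taken from the text) and evaluates $\beta(\sigma(X_0), \sigma(X_n))$ from the joint law, using the fact that for finite $\sigma$-fields the supremum in the definition of $\beta$ is attained at the partitions into atoms. The function names are hypothetical.

```python
from fractions import Fraction

def beta_from_joint(joint):
    """(1/2) * sum_ij |P(X=i, Y=j) - P(X=i) P(Y=j)| for finite-state X and Y."""
    rows = [sum(r) for r in joint]            # marginal law of X
    cols = [sum(c) for c in zip(*joint)]      # marginal law of Y
    total = sum(
        abs(joint[i][j] - rows[i] * cols[j])
        for i in range(len(rows))
        for j in range(len(cols))
    )
    return total / 2

def cyclic_joint(p, n):
    """Joint law of (X_0, X_n) for the cyclic chain with uniform start."""
    return [
        [Fraction(1, p) if j == (i + n) % p else Fraction(0) for j in range(p)]
        for i in range(p)
    ]

for p in (2, 3, 5):
    for n in range(1, 8):
        # The dependence does not decay: the value is 1 - 1/p for every n.
        assert beta_from_joint(cyclic_joint(p, n)) == 1 - Fraction(1, p)
```

Here the $p$ atoms of the tail $\sigma$-field are the events $\{X_0 = i\}$, each of probability $1/p$, in line with the description in Theorem 7.2.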

For  strictly  stationary  sequences  there  is  essentially  no  restriction 
on  the  mixing  rates  for  the  mixing  conditions  being  discussed  here.  See 
e.g.  [51,  Theorems  2,3,  and  4],  [49,  pp.  181-190],  [11,  Theorem  2], 

[12,  Theorem  6],  and  [14,  Theorem  2].  Also,  for  most  mixing  rates  used  in 
the  literature  (in  particular  exponential,  polynomial,  or  logarithmic), 
the  mixing  conditions  being  discussed  here  can  all  hold  with  essentially 
the  same  given  rate;  this  is  a  consequence  of  the  following  theorem  which 
(because  of  eqns.  (1.7)  and  (1.8))  is  just  [18,  Theorem  1]: 


Theorem 7.3: Suppose $g: [0,\infty) \to (0,\infty)$ is a positive continuous
non-increasing function such that $\lim_{x\to\infty} g(x) = 0$,
$g(0) \le 1/24$, and $\log g$ is convex on $[0,\infty)$. Then there exists a
strictly stationary sequence $(X_k)$ such that for every $n \ge 1$,
$(1/4)g(n) \le \alpha(n), \beta(n), \rho(n), \phi(n), \psi(n) \le 8g(n)$.

(The conditions on $g$ here are of course somewhat redundant.) Under fewer
restrictions on $g$, Theorem 7.3 was already shown for non-stationary
sequences by Kesten and O'Brien [51, Theorem 1].

Theorems 7.1 and 7.2 do not extend to non-stationary sequences. In
limit theory for non-stationary sequences, conditions such as
"$\psi(n) < \infty$ for some $n$" or "$\phi(n) < 1$ for some $n$" are
sometimes useful (see e.g. [26] [58]). Note that if $\psi(n) < \infty$ then
(for the same $n$) $\phi(n) < 1$. (In fact it is not hard to show that
$\phi(\mathcal{A},\mathcal{B}) < 1$ if either
$\sup P(A \cap B)/[P(A)P(B)] < \infty$ or $\inf P(A \cap B)/[P(A)P(B)] > 0$,
the sup and inf being taken over all $A \in \mathcal{A}$ and
$B \in \mathcal{B}$ with $P(A) > 0$ and $P(B) > 0$.) For non-stationary
sequences, conditions such as $\psi(n) < \infty$ or $\phi(n) < 1$ do not
impose any essential restrictions on the moments of the r.v.'s or on the
rate at which $\alpha(n)$ or $\beta(n)$ or $\rho(n)$ might perhaps converge
to 0, nor do they preclude second-order stationarity. Defining
$\phi_{rev}(n) := \sup_{J \in \mathbb{Z}} \phi_{rev}(\mathcal{F}_{-\infty}^{J}, \mathcal{F}_{J+n}^{\infty})$
for each $n = 1,2,3,\ldots$, one has the following theorem:

Theorem 7.4: Suppose $d_1, d_2, \ldots$ is a non-increasing sequence of
positive numbers such that $\lim_{n\to\infty} d_n = 0$. Suppose
$0 < c \le 1$. Then there exists a (not strictly stationary) sequence
$(X_k, k \in \mathbb{Z})$ with the following eight properties:

(i) For all $k \in \mathbb{Z}$, $EX_k = 0$ and $EX_k^2 = 1$.

(ii) For all $k, \ell \in \mathbb{Z}$ with $k \ne \ell$, $EX_kX_\ell = 0$.

(iii) For all $k \in \mathbb{Z}$, $|X_k| < 2$ a.s.

(iv) For all $n \ge 1$, $\psi(n) = c$.

(v) For all $n \ge 1$, $\phi(n) = \phi_{rev}(n) = c/2$.

(vi) $\rho(n) \asymp d_n^{1/2}$ as $n \to \infty$.

(vii) $\beta(n) \asymp d_n$ as $n \to \infty$.

(viii) $\alpha(n) \asymp d_n$ as $n \to \infty$.

Here the notation $a \asymp b$ means that $a = O(b)$ and $b = O(a)$. This
theorem apparently has not been published anywhere; its proof will be given
here, based on an argument used by Kesten and O'Brien [51, Theorem 1].

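First some preliminary calculations are needed. Suppose $0 < c \le 1$
and $0 < d \le 1/4$. Suppose $U$ and $V$ are r.v.'s, each with state space
$\{-2, -1, 1, 2\}$ and marginal probability vector
$[d, 1/2 - d, 1/2 - d, d]$, and with joint probability function $(f_{ij})$,
where $f_{ij} := P(U = i, V = j)$, given by the matrix

$$\begin{pmatrix} f_{-2,-2} & f_{-2,-1} & f_{-2,1} & f_{-2,2} \\ f_{-1,-2} & f_{-1,-1} & f_{-1,1} & f_{-1,2} \\ f_{1,-2} & f_{1,-1} & f_{1,1} & f_{1,2} \\ f_{2,-2} & f_{2,-1} & f_{2,1} & f_{2,2} \end{pmatrix} = \begin{pmatrix} d^2 & d(\tfrac12 - d)(1+c) & d(\tfrac12 - d)(1-c) & d^2 \\ d(\tfrac12 - d)(1-c) & (\tfrac12 - d)^2 & (\tfrac12 - d)^2 & d(\tfrac12 - d)(1+c) \\ d(\tfrac12 - d)(1+c) & (\tfrac12 - d)^2 & (\tfrac12 - d)^2 & d(\tfrac12 - d)(1-c) \\ d^2 & d(\tfrac12 - d)(1-c) & d(\tfrac12 - d)(1+c) & d^2 \end{pmatrix}.$$

By elementary calculations one can show that
$\psi(\sigma(U), \sigma(V)) = c$,
$\phi(\sigma(U), \sigma(V)) = \phi_{rev}(\sigma(U), \sigma(V)) = c(1/2 - d)$,
$\beta(\sigma(U), \sigma(V)) = 4cd(1/2 - d)$, and

$$\alpha(\sigma(U), \sigma(V)) \ge P(U = 1 \text{ or } 2,\ V = -2 \text{ or } 1) - P(U = 1 \text{ or } 2)P(V = -2 \text{ or } 1) = 2cd(1/2 - d).$$

By the first inequality in (1.7), in fact
$\alpha(\sigma(U), \sigma(V)) = 2cd(1/2 - d)$. Also, by eqn. (1.12) (with
$\mathcal{A} := \sigma(U)$ and $\mathcal{B} := \sigma(V)$),
$\lambda(\mathcal{A},\mathcal{B}) \le [\alpha(\mathcal{A},\mathcal{B}) \cdot \psi(\mathcal{A},\mathcal{B})]^{1/2} \le cd^{1/2}$.
Since $U$ and $V$ each have only four states,
$\rho(\sigma(U), \sigma(V)) \le 16\lambda(\mathcal{A},\mathcal{B}) \le 16cd^{1/2}$.
Also,
$\rho(\sigma(U), \sigma(V)) \ge \mathrm{Corr}(I_{\{U=2\}}, I_{\{V=1\}}) \ge (c/2)d^{1/2}$
(using the fact that $d \le 1/4$). Also, $EU = EV = EUV = 0$,
$EU^2 = EV^2 > 1$, and $\|U\|_\infty = \|V\|_\infty = 2$.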

Now we mimic the construction in [51, Theorem 1]. For each
$m = 1,2,3,\ldots$ let $(X_k, k = m^2, m^2 + m)$ be a random vector with the
same distribution as the random vector $(U,V)$ above, in the case where the
parameters are $c$ and $d_m$ from the hypothesis of Theorem 7.4. We assume
without loss of generality that $d_m \le 1/4$ for all $m$. For all integers
$k$ that are not of the form $m^2$ or $m^2 + m$, $m = 1,2,\ldots$, let $X_k$
be a r.v. such that $P(X_k = 1) = P(X_k = -1) = 1/2$. We impose the
additional restriction that $(X_1,X_2)$, $(X_4,X_6)$, $(X_9,X_{12})$,
$(X_{16},X_{20}), \ldots$ and
$\ldots, X_{-2}, X_{-1}, X_0, X_3, X_5, X_7, X_8, X_{10}, X_{11}, X_{13}, \ldots$
are independent. Following the argument in [51, Theorem 1] we have that for
each $n = 1,2,3,\ldots$

$$\alpha(n) = \sup_{m \ge n} \alpha(\sigma(X_{m^2}), \sigma(X_{m^2+m}))$$

and the analogous statement holds for the other dependence coefficients.
Using the calculations in the preceding paragraph, one can show that the
sequence $(X_k)$ constructed here satisfies all properties (i)-(viii) in
Theorem 7.4, except that $EX_k^2 > 1$ for $k = m^2, m^2 + m$,
$m = 1,2,\ldots$. Dividing each $X_k$ by its standard deviation, we obtain a
new sequence $(X_k)$ satisfying all properties in Theorem 7.4. This
completes the proof.
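
The "elementary calculations" for the pair $(U,V)$ used in this proof can be verified by brute force for any particular rational $c$ and $d$. The sketch below is an illustration under stated assumptions: the joint matrix is laid out with rows indexed by the value of $U$ and columns by the value of $V$, in the order $-2, -1, 1, 2$, as read from the preliminary calculations, and the names `joint_matrix`, `nonempty_subsets`, and `coefficients` are hypothetical. The coefficients $\alpha$, $\phi$, and $\psi$ are computed by enumerating all pairs of non-empty events, and $\beta$ from the atoms.

```python
from fractions import Fraction
from itertools import combinations

STATES = (-2, -1, 1, 2)

def joint_matrix(c, d):
    """Joint law f[i][j] = P(U = STATES[i], V = STATES[j])."""
    h = Fraction(1, 2) - d
    lo, hi = d * h * (1 - c), d * h * (1 + c)
    return [
        [d * d, hi, lo, d * d],
        [lo, h * h, h * h, hi],
        [hi, h * h, h * h, lo],
        [d * d, lo, hi, d * d],
    ]

def nonempty_subsets(n):
    return [s for r in range(1, n + 1) for s in combinations(range(n), r)]

def coefficients(f):
    """Return (alpha, beta, phi, psi) between sigma(U) and sigma(V)."""
    p = [sum(row) for row in f]               # law of U
    q = [sum(col) for col in zip(*f)]         # law of V
    alpha = phi = psi = Fraction(0)
    for A in nonempty_subsets(4):
        pa = sum(p[i] for i in A)
        for B in nonempty_subsets(4):
            qb = sum(q[j] for j in B)
            gap = abs(sum(f[i][j] for i in A for j in B) - pa * qb)
            alpha = max(alpha, gap)
            phi = max(phi, gap / pa)
            psi = max(psi, gap / (pa * qb))
    beta = sum(abs(f[i][j] - p[i] * q[j]) for i in range(4) for j in range(4)) / 2
    return alpha, beta, phi, psi

c, d = Fraction(1, 2), Fraction(1, 8)         # any 0 < c <= 1, 0 < d <= 1/4
f = joint_matrix(c, d)
h = Fraction(1, 2) - d
assert [sum(row) for row in f] == [d, h, h, d]
assert coefficients(f) == (2 * c * d * h, 4 * c * d * h, c * h, c)
assert sum(STATES[i] * STATES[j] * f[i][j] for i in range(4) for j in range(4)) == 0
```

The final assertion checks $EUV = 0$, which is what yields property (ii) of Theorem 7.4 once the pairs are embedded in the full sequence.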
8.  APPROXIMATION  OF  MIXING  SEQUENCES  BY  OTHER  RANDOM  SEQUENCES

In  proving  limit  theorems  for  a  given  sequence  of  random  variables 
satisfying  strong  mixing  conditions,  it  is  often  useful  to  approximate  the 
sequence  by  another  random  sequence  with  certain  properties.  Here  we  shall 
give  some  pertinent  references  on  such  techniques. 

The technique of directly approximating a mixing sequence by a sequence
of martingale differences was introduced by Gordin [40]. It has been
exploited by many people (see e.g. [64] [42] as well as the exposition of it
in [50, pp. 127-131]). A long-standing, previously unnoticed error in one
particular application of Gordin's technique was pointed out by Herrndorf
[43] (see [20] for further details).

The technique of directly approximating a mixing sequence by a sequence
of independent random variables was introduced by Berkes and Philipp [6] [7].
This technique is a versatile one which often allows one to handle very slow
(logarithmic) mixing rates (see e.g. [7] [28] [16]), other types of
conditions of weak dependence besides the strong mixing conditions being
discussed here (see e.g. [36]), and also random variables taking their values
in general Banach spaces (see e.g. [33] and the references therein). This
technique works particularly nicely with the absolute regularity condition
(see [1, Corollary 4.2.5] [3] [23] [33]). Dehling [31] exposed a weakness
in this technique under just the strong mixing condition (or even, by the
same argument, the $\rho$-mixing condition) when the random variables are
taking their values in general Banach spaces.

For proving theorems in renewal theory, "coupling" techniques are often
useful. Berbee [1, p. 104, Theorem 4.4.7] characterized the strictly
stationary absolutely regular sequences in terms of a coupling property:


Theorem 8.1 (Berbee): A strictly stationary sequence
$(X_k, k \in \mathbb{Z})$ of random variables is absolutely regular if and
only if there exists a probability space with sequences
$(X'_k, k \in \mathbb{Z})$ and $(X''_k, k \in \mathbb{Z})$, each having the
same distribution as $(X_k, k \in \mathbb{Z})$, such that (i)
$(X'_k, k \le 0)$ and $(X''_k, k \in \mathbb{Z})$ are independent and (ii)
$P(\exists n \ge 1 \text{ such that } X'_k = X''_k \text{ for all } k \ge n) = 1$.

Berbee [1, p. 106] also explains that if $(X_k)$ is absolutely regular (i.e.
satisfies $\beta(n) \to 0$), then in the context of Theorem 8.1 the
inequality

$$P(\exists k \ge n \text{ such that } X'_k \ne X''_k) \ge \beta(n) \qquad (8.1)$$

automatically holds for all $n \ge 1$, and that there exists an "optimal"
coupling in which equality in eqn. (8.1) is achieved simultaneously for all
$n \ge 1$.

This is analogous to Griffeath's [41] well known "maximal coupling" result
for Markov chains.